Abstract: Consider a Gaussian relay network where a number
of sources communicate to a destination with the help of several
layers of relays. Recent work has shown that a compress-and-forward
based strategy at the relays can achieve the capacity
of this network within an additive gap. In this strategy, the
relays quantize their observations at the noise level and map
them to a random Gaussian codebook. The resultant capacity gap
is independent of the SNRs of the channels in the network but
linear in the total number of nodes.

In this paper, we show that if the relays quantize their signals 
at a resolution decreasing with the number of nodes in the 
network, the additive gap to capacity can be made logarithmic in 
the number of nodes for a class of layered, time-varying wireless 
relay networks. This suggests that the rule-of-thumb to quantize 
the received signals at the noise level used for compress-and- 
forward in the current literature can be highly suboptimal. 

I. Introduction 

Consider a source node communicating to a destination 
node via a sequence of relays connected by point-to-point
channels (see Figure 1(a)). The capacity of this line network
is achieved by simple decode-and-forward and is equal to 
the minimum of the capacities of the successive point-to- 
point links. The decoding at each stage removes the noise 
corrupting the information signal and therefore the end-to- 
end rate achieved is independent of the number of times the 
message gets retransmitted. 

Fig. 1: (a) Line network, (b) Multi-layer relay network for
K = 3; each $H_i$ is a Rayleigh fading matrix.

Unfortunately, the optimality of decode-and-forward is limited
to this line topology. In more general networks with
multiple relays at each layer, it is well understood that the
rate achieved by decode-and-forward can be arbitrarily far
from capacity. Recent work by Avestimehr et al. [1] has shown
that compress-and-forward can be a better fit for general relay
networks. In any relay network with multi-source multicast
traffic, it has been shown that a compress-and-forward based
relaying strategy can achieve the capacity of the network
within a gap that is independent of the SNRs of the constituent
channels [1], [2], [3]. However, the gap to capacity increases
linearly in the number of nodes in the network. For example,
for the line network in Figure 1(a) it would lead to a gap that
is linear in the depth of the network D. One natural way to
explain this gap is noise accumulation. As the information
signal proceeds deeper into the network, it is corrupted by
more and more noise. Therefore, any strategy that does not
remove the noise corrupting the signal at each stage will
naturally suffer a rate loss that increases with the number of
stages. However, it is not clear why this rate loss should be
linear in the depth of the network as the current results in
the literature suggest [1], [2], [3]. The total variance of the
accumulated noise over the D stages of the network is D times
the variance of the noise at each stage (assuming identical
noise variances over the D stages). A factor of D increase
in the noise variance in a point-to-point Gaussian channel
would lead to a decrease of at most log D in capacity, and
therefore it is natural to ask if we can reduce the linear
performance loss of compress-and-forward strategies to
logarithmic in D.
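The point-to-point intuition above can be checked numerically. The following is a minimal sketch, assuming natural logarithms (rates in nats) and a single complex AWGN link; the function names `awgn_capacity` and `accumulation_loss` are illustrative and are not taken from the paper.

```python
import math


def awgn_capacity(snr: float) -> float:
    """Capacity (in nats) of a complex AWGN channel with the given SNR."""
    return math.log(1.0 + snr)


def accumulation_loss(snr: float, depth: int) -> float:
    """Capacity lost when the noise variance grows by a factor of `depth`."""
    return awgn_capacity(snr) - awgn_capacity(snr / depth)


if __name__ == "__main__":
    for depth in (2, 8, 32, 128):
        loss = accumulation_loss(snr=1000.0, depth=depth)
        # The loss never exceeds log(depth) and approaches it at high SNR.
        print(f"D={depth:4d}  loss={loss:.3f}  log D={math.log(depth):.3f}")
```

The loss stays below log D for every SNR, which is the logarithmic behavior the question above asks compress-and-forward to match.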

This paper is based on the observation that if the relay nodes
in Figure 1(a) quantize their observed signals at a resolution
decreasing linearly in D, the rate loss due to compress-and-forward
is only logarithmic in D (see Section III). This
suggests that the rule-of-thumb to quantize the received signals
at the noise level used for compress-and-forward in the current
literature [1], [2], [3] can be highly suboptimal. This is because
the rate penalty for describing the quantized signals can
be significantly larger than the rate penalty associated with
coarser quantization. This insight was used in [4] to show
that compress-and-forward based strategies can achieve the
capacity of the N-relay Gaussian diamond network within a
gap that is logarithmic in N.

The main setup we consider in this paper is the multi-layer
Gaussian relay network in Figure 1(b). Here K source nodes
communicate to a destination node equipped with multiple
antennas over D layers, each layer containing K single-antenna
relays. Each relay observes a noisy linear combination
of the signals transmitted by the relays in the previous layer.
All channels are subject to i.i.d. Rayleigh fast-fading. Current
results on compress-and-forward [1], [2], [3] yield a sum-rate
which is within a gap of 1.3KD of the capacity of this network,
where KD is the total number of nodes. Instead, we show that
if relays quantize their received signals at a resolution that
decreases as the number of nodes increases, compress-and-forward
can achieve a sum-rate which is within an additive
gap of K log D + K of the network sum-capacity. So for a
fixed K, as the number of layers D increases, this gap only
grows logarithmically in the depth of the network D (and
therefore logarithmically in the number of nodes KD). As
a side result, we provide an analysis of the compress-and-forward
based strategies in [1], [2], [3] in fast fading wireless
networks.

This same setup has been considered in [5], where a
computation alignment strategy is proposed to remove the
noise that accumulates with the depth of the network. This yields
a gap of $7K^3 + 5K \log K$. The computation alignment strategy
is based on the idea of combining compute-forward [6] with
ergodic alignment proposed in [7]. While the gap to capacity
obtained by computation alignment is independent of D, this 
strategy is significantly more complex than compress-forward 
and has a number of disadvantages from a practical perspec- 
tive. In particular, ergodic alignment over the fading process 
leads to large delays in communication and requires each relay 
to know the instantaneous realizations of all the channels in 
the network. Moreover, its performance critically depends on 
the symmetry of the fading statistics. The compress-forward 
strategy with improved quantization we propose in this paper 
has minimal requirements. In particular, no channel state 
information is required at the source and at the relays, and 
the fading statistics are not critical to the operation of the 
strategy. 



II. Model and Preliminaries 



A. Model 



We consider the configuration shown in Figure 1(b). The
network is a directed layered network, each layer except the
last containing K nodes. The nodes in the i-th layer are
collectively referred to as $\mathcal{V}_i$, where $0 \le i \le D$. Nodes in
$\mathcal{V}_0$ are the K source nodes $\{s_j\}_{j=1}^{K}$, having messages at rate
$R_j$ to be communicated to the single destination node d in
$\mathcal{V}_D$, which has K antennas. Since $\mathcal{V}_D$ only contains d, we
use d and $\mathcal{V}_D$ interchangeably in the sequel. We assume that
d is equipped with multiple antennas in order to keep the
problem interesting. Otherwise, the minimum cut becomes the
multiple-input-single-output cut from the last layer of relays to
d and this trivializes the problem of approximately achieving
the capacity of the network. Instead of multiple antennas at d,
one can also assume orthogonal bit-pipes from nodes in $\mathcal{V}_{D-1}$
to d, as done in . Let $\mathcal{V}^i$ denote $\mathcal{V}_0 \cup \mathcal{V}_1 \cup \cdots \cup \mathcal{V}_i$ and $\mathcal{N}$
denote the set of all nodes, i.e. $\mathcal{N} = \mathcal{V}^D$.

For $0 \le i \le D-1$, the received signal at nodes in $\mathcal{V}_{i+1}$ (or
antennas if $i = D-1$) depends only on the transmit signals
of nodes in $\mathcal{V}_i$ and at time t is given by

$$Y_{\mathcal{V}_{i+1}}[t] = H_{\mathcal{V}_i \to \mathcal{V}_{i+1}}[t]\, X_{\mathcal{V}_i}[t] + Z_{\mathcal{V}_{i+1}}[t],$$

where $Y_{\mathcal{V}_{i+1}}$ and $X_{\mathcal{V}_i}$ are vectors containing the received and
transmitted signals at nodes in $\mathcal{V}_{i+1}$ and $\mathcal{V}_i$ respectively, and
$Z_{\mathcal{V}_{i+1}} \sim \mathcal{CN}(0, \sigma^2 I)$, i.e. we assume flat-fading channels
between the nodes with i.i.d. circularly symmetric complex
Gaussian noise. The $(k, l)$'th entry of the matrix $H_{\mathcal{V}_i \to \mathcal{V}_{i+1}}[t]$
denotes the channel coefficient from the $l$'th relay in $\mathcal{V}_i$ to the $k$'th
relay in $\mathcal{V}_{i+1}$ at time t. We further assume that channels
are i.i.d. Rayleigh fading, i.e. each entry in the matrices
$\{H_{\mathcal{V}_0 \to \mathcal{V}_1}[t], H_{\mathcal{V}_1 \to \mathcal{V}_2}[t], \ldots, H_{\mathcal{V}_{D-1} \to d}[t]\}$ is i.i.d. $\mathcal{CN}(0, 1)$
across time, and independent of other entries and independent
of the noise and transmissions. (The conclusions of the paper
also hold under a block fading model.) All transmitting nodes
are subject to a long-term average power constraint P. We can
assume that $Y_{\mathcal{V}_0} = 0$ and $X_d = 0$. The source nodes and the
relay nodes do not know the instantaneous realizations of the
channel coefficients, i.e. have no transmit or receive channel
state information. (The source nodes know the topology of the
network and the channel statistics, i.e. the end-to-end ergodic
rate supported by the network.) All channel realizations are
known at the destination node and are used while decoding
the transmitted messages from the source nodes. The largest
achievable sum-rate $\sum_{j=1}^{K} R_j$ in the network is called the
sum-capacity of the network, denoted by $C_{\text{sum}}$.
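The received-signal model above can be simulated for a single hop and a single channel use. This is a minimal sketch using only the Python standard library; the helper names `complex_gaussian` and `layer_output` are hypothetical and chosen for illustration. The fading matrix is returned only because the destination is assumed to know it, whereas the relays never use it.

```python
import math
import random


def complex_gaussian(variance: float, rng: random.Random) -> complex:
    """Draw one circularly symmetric complex Gaussian sample CN(0, variance)."""
    std = math.sqrt(variance / 2.0)
    return complex(rng.gauss(0.0, std), rng.gauss(0.0, std))


def layer_output(x, noise_var, rng):
    """Return (Y, H) for one hop: Y = H X + Z with i.i.d. CN(0, 1) fading.

    `x` holds the K transmit symbols of one layer. H is K-by-K and is redrawn
    on every call, which models fast fading across channel uses.
    """
    k = len(x)
    h = [[complex_gaussian(1.0, rng) for _ in range(k)] for _ in range(k)]
    y = [sum(h[r][c] * x[c] for c in range(k)) + complex_gaussian(noise_var, rng)
         for r in range(k)]
    return y, h


if __name__ == "__main__":
    rng = random.Random(0)
    K, P, sigma2 = 3, 1.0, 0.1
    x = [complex_gaussian(P, rng) for _ in range(K)]  # one Gaussian codebook symbol per node
    y, h = layer_output(x, sigma2, rng)
    print(y)
```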

B. Preliminaries 

A cut $\Omega$ is a subset of $\mathcal{N}$. Let $H[t]$ be a random vector
containing all the channel realizations in the network. Since
the channel realizations are known at the destination d, we
can view $H[t]$ as part of the output of d at time t, i.e., at
time t, d observes $Y_d[t]$ and $H[t]$. Note that this does not alter
the memorylessness property of the network. For the sake of
notational convenience in the proofs, we define the following
quantity for a cut $\Omega$,

$$C(\Omega) := I(X_\Omega; Y_{\Omega^c}, H \mid X_{\Omega^c}) = I(X_\Omega; Y_{\Omega^c} \mid X_{\Omega^c}, H) \qquad (1)$$



where $X_\mathcal{N}$ are jointly distributed with some distribution such
that the average power constraints are satisfied. The second
equality follows from the fact that $I(X_\Omega; H \mid X_{\Omega^c}) = 0$ since
the distribution of $X_{\mathcal{N} \setminus d}$ is independent of H (H is unknown
to all nodes but the destination) and $X_{\mathcal{V}_D} = 0$. With this
notation, the information-theoretic cutset upper bound
[8, Theorem 15.10.1] on the achievable rates in the network
can be expressed as follows: If each source $s_j$ can reliably
communicate at a rate $R_j$ simultaneously, then there exists
some joint distribution $p(X_\mathcal{N})$ on $X_\mathcal{N}$ such that

$$\sum_{j:\, s_j \in \Omega} R_j \le C(\Omega) \quad \text{for all cuts } \Omega. \qquad (2)$$

III. Line Network 
We first illustrate the main idea of this paper in a simple
setting, the line network in Figure 1(a). Here we assume
that each link i is an AWGN channel with gain $h_i$ and the
channel gains $h_i$ are fixed and known. Each node has power
P and the noise variance is $\sigma^2$. (The conclusions below
also hold under a fast fading assumption similar to the one
described in Section II.) As mentioned before, a decode-forward
strategy at the relays achieves the capacity of this line
network, while compress-and-forward based strategies (such
as quantize-map-forward in [1] and noisy network coding in
[2]) with quantization done at the noise level have a gap to
capacity that is linear in the number of nodes D. Here, we
show that if relays instead quantize at $(D-1)$ times the noise
level, the gap to capacity becomes logarithmic in D.

Number the nodes s through d as $0, 1, 2, \ldots, D$. Let us
consider the rate achievable by noisy network coding for this
network, assuming all relay nodes choose their transmission
codebooks independently from a Gaussian distribution, i.e.
$X_i \sim \mathcal{CN}(0, P)$ and independent of each other. Theorem 1
in [2] says that the following rate is achievable
where we are assuming that the destination node also performs
quantization for simplicity.

Now, let each relay choose $\hat{Y}_i = Y_i + \hat{Z}_i$ where
$\hat{Z}_i \sim \mathcal{CN}(0, (D-1)\sigma^2)$ is independent of everything else. Since
$Y_{i+1} = h_i X_i + Z_{i+1}$, the channel from $X_i$ to $\hat{Y}_{i+1}$ is
effectively an AWGN channel of noise power $D\sigma^2$ with gain
$h_i$. Then the first term in the achievable rate expression
becomes

$$\log\left(1 + \frac{|h_i|^2 P}{D\sigma^2}\right),$$

which is greater than or equal to

$$\log\left(1 + \frac{|h_i|^2 P}{\sigma^2}\right) - \log(D).$$
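The lower bound on the first term can be verified in one line. As a worked step, write $a = |h_i|^2 P/\sigma^2$ for the per-link SNR (a shorthand introduced here for illustration, not notation from the paper). Since $D \ge 1$,

```latex
\log\left(1 + \frac{a}{D}\right)
  = \log(D + a) - \log D
  \ge \log(1 + a) - \log D .
```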



Due to the coarse quantization, the second term in the
achievable rate expression is reduced significantly as compared
to quantizing at the noise level. We have

$$I(Y_{\mathcal{V}^i}; \hat{Y}_{\mathcal{V}^i} \mid X_\mathcal{N}, \hat{Y}_{(\mathcal{V}^i)^c}) = i \log\left(1 + \frac{1}{D-1}\right) \le 1$$

since $i \le D-1$. Since the capacity of the line network is given
by the minimum of the capacities of each link, $\min_i \log(1 + |h_i|^2 P/\sigma^2)$,
we see that decreasing the resolution of quantization
as the number of nodes increases results in a gap of $\log(D) + 1$.
If the quantization were done at the noise level, the first
term in the noisy network coding achievable rate would suffer
from only a log(2) decrease instead of log(D) with respect
to capacity; however, the second term would be linear in D,
overall resulting in a gap to capacity that is linear in D.
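The trade-off between the two terms can be tabulated with a short script. This is a rough sketch rather than the paper's exact expression: it assumes natural logarithms, bounds the first-term loss by log(1 + q) and the description penalty by (D - 1) log(1 + 1/q), where q is the quantization noise variance divided by $\sigma^2$. The name `gap_bound` and the parameter `quant_ratio` are hypothetical.

```python
import math


def gap_bound(depth: int, quant_ratio: float) -> float:
    """Upper bound (in nats) on the gap to capacity of the line network.

    `quant_ratio` is the quantization noise variance divided by sigma^2.
    The first term is the loss caused by the effective noise power
    (1 + quant_ratio) * sigma^2; the second is the penalty for describing
    up to depth - 1 quantized relay observations.
    """
    first_term_loss = math.log(1.0 + quant_ratio)
    description_penalty = (depth - 1) * math.log(1.0 + 1.0 / quant_ratio)
    return first_term_loss + description_penalty


if __name__ == "__main__":
    for depth in (2, 10, 100, 1000):
        noise_level = gap_bound(depth, 1.0)    # quantize at the noise level
        coarse = gap_bound(depth, depth - 1)   # quantize at (D - 1) times the noise level
        print(f"D={depth:5d}  noise-level: {noise_level:8.2f}  coarse: {coarse:6.2f}")
```

Under these assumptions, quantizing at the noise level gives a bound of D log 2, while the coarser choice stays below log D + 1.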

IV. Layered Network with Multiple Relays 
The main result of this paper is the following theorem. 



Theorem 1. The sum-capacity of the network in Figure 1(b)
is bounded by

$$C(K, K) - K \log(D) - K \le C_{\text{sum}} \le C(K, K) \qquad (3)$$

where the lower bound is achievable by a compress-and-forward
strategy with appropriately chosen quantization levels.
$C(K, K)$ denotes the ergodic capacity of a K-by-K MIMO
Rayleigh fast-fading channel with per-antenna average power
constraint of P and noise variance $\sigma^2$ and is equal to the
information-theoretic cutset upper bound on the sum-capacity
of the network.

We first prove Theorem 1 for the case when the K source
nodes $\{s_1, \ldots, s_K\}$ are co-located, i.e. $\{s_1, \ldots, s_K\}$ behave
like a single source denoted by s with K antennas, with a
per-antenna power constraint P, transmitting a message at rate R
to the destination d (see Figure 2). In this case, we show that the
point-to-point capacity satisfies the bounds in Theorem 1.
At the end of this section, we extend the proof for the capacity
in the single-source case to the sum-capacity in the original
setup containing multiple sources.

We prove Theorem 1 for the single source setup in two
steps. We first establish the upper bound on the capacity in
Section IV-A and then show that it is achievable within a gap
of $K \log(D) + K$ in Section IV-B.

A. Upper bound 

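The upper bound in Theorem 1 is easy to prove. Consider
the cutset upper bound in (2) for the single source case:

$$R \le \min_{\Omega:\, s \in \Omega,\, d \in \Omega^c} I(X_\Omega; Y_{\Omega^c} \mid X_{\Omega^c}, H).$$

Considering only the cut $\Omega = \mathcal{V}_0$ implies that

$$\begin{aligned}
R &\le \max_{p(X_\mathcal{N})} \min_{\Omega:\, s \in \Omega,\, d \in \Omega^c} C(\Omega) \\
  &\le \max_{p(X_\mathcal{N})} C(\mathcal{V}_0) \\
  &= \max_{p(X_\mathcal{N})} I(X_{\mathcal{V}_0}; Y_{\mathcal{V}_0^c} \mid X_{\mathcal{V}_0^c}, H) \\
  &\overset{(a)}{=} E\left[\log\det\left(I + \frac{P}{\sigma^2} H_{\mathcal{V}_0 \to \mathcal{V}_1} H_{\mathcal{V}_0 \to \mathcal{V}_1}^{\dagger}\right)\right] \\
  &= C(K, K),
\end{aligned} \qquad (4)$$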



where (a) follows from the fact that the maximal mutual
information in the earlier line corresponds to the ergodic
capacity of a $K \times K$ MIMO Rayleigh fast-fading channel with
per-antenna average power constraint P and the maximizing
input distribution for this channel is well known to be i.i.d.
$\mathcal{CN}(0, P)$ [9]. We denote this capacity by $C(K, K)$.



Remark: The cutset upper bound in (2) bounds the rate
with many additional constraints arising from cuts other than
$\mathcal{V}_0$. In the above derivation, by concentrating on a single
cut $\Omega = \mathcal{V}_0$ we have derived an upper bound (4) on the
cutset bound. Although such an upper bound can be weaker
in general, in the current case it can be shown that $C(K, K)$
is indeed the tightest constraint on the rate imposed by the
cutset bound. This can be observed from the discussion in the
next section (Claim 1), which implicitly shows that the cutset
bound on the rate evaluated under i.i.d. distributions is equal to
$C(K, K)$. Since the cutset bound evaluated under a particular
distribution forms a lower bound on the actual bound obtained
from (2), this shows that the tightest constraint imposed on the
rate by the cutset bound is exactly equal to $C(K, K)$.

B. Achievability 

We now prove the lower bound in Theorem 1. We start with
the rate achieved by noisy network coding in [2, Theorem 1],
which states that all rates R that satisfy

$$\begin{aligned}
R &< \min_{\Omega:\, s \in \Omega,\, d \in \Omega^c} I(X_\Omega; \hat{Y}_{\Omega^c}, H \mid X_{\Omega^c}) - I(Y_\Omega; \hat{Y}_\Omega \mid X_\mathcal{N}, \hat{Y}_{\Omega^c}, H) \\
  &= \min_{\Omega:\, s \in \Omega,\, d \in \Omega^c} I(X_\Omega; \hat{Y}_{\Omega^c} \mid X_{\Omega^c}, H) - I(Y_\Omega; \hat{Y}_\Omega \mid X_\mathcal{N}, \hat{Y}_{\Omega^c}, H)
\end{aligned}$$

for some joint distribution of the form
$\prod_{k \in \mathcal{N}} p(x_k)\, p(\hat{y}_k \mid y_k, x_k)$ are achievable. The equality
again follows from the fact that $I(X_\Omega; H \mid X_{\Omega^c}) = 0$. Hence,
the following R is achievable:

$$R < \min_{\Omega:\, s \in \Omega,\, d \in \Omega^c} I(X_\Omega; \hat{Y}_{\Omega^c}, H \mid X_{\Omega^c}) - \max_{\Omega:\, s \in \Omega,\, d \in \Omega^c} I(Y_\Omega; \hat{Y}_\Omega \mid X_\mathcal{N}, \hat{Y}_{\Omega^c}, H). \qquad (5)$$

We choose the input distribution $X_k$ at each node k to be
$\mathcal{CN}(0, P)$ (and similarly the input distributions corresponding
to the antennas of the source node are i.i.d. $\mathcal{CN}(0, P)$). We
choose $\hat{Y}_k$ such that $\hat{Y}_k = Y_k + \hat{Z}_k$ where $\hat{Z}_k$ is
$\mathcal{CN}(0, (D-1)\sigma^2)$ independent of everything else. Note the difference with
the quantization in [1], [2], [3]: the quantization noise has
variance $(D-1)\sigma^2$ as opposed to $\sigma^2$, the noise variance. For
simplicity, we also assume that the destination quantizes its
observation according to $\hat{Y}_{\mathcal{V}_D} = Y_{\mathcal{V}_D} + \hat{Z}_{\mathcal{V}_D}$, where
$\hat{Z}_{\mathcal{V}_D} \sim \mathcal{CN}(0, (D-1)\sigma^2)$ is independent of everything else, and treats
$\hat{Y}_{\mathcal{V}_D}$ (denoted by $\hat{Y}_d$ for brevity) as its observation, along with
all the channel realizations H.

We will evaluate the right-hand side of (5) in two steps. In
Lemma 1 we upper bound the second term by K. In Lemma 2
we lower bound the first term by $C(K, K) - K \log D$. Combining
the two results gives the lower bound in Theorem 1 (for the
single source case).




Fig. 2: Links crossing the cut $\Omega$ denoted by dashed lines;
$M_1 = 2$, $M_2 = 1$, $M_3 = 2$, $M_4 =$

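Lemma 1.

$$\max_{\Omega:\, s \in \Omega,\, d \in \Omega^c} I(Y_\Omega; \hat{Y}_\Omega \mid X_\mathcal{N}, \hat{Y}_{\Omega^c}, H) \le K$$

Proof: Given our choice for the distributions of the
random variables involved, we have

$$\begin{aligned}
I(Y_\Omega; \hat{Y}_\Omega \mid X_\mathcal{N}, \hat{Y}_{\Omega^c}, H)
  &= h(\hat{Y}_\Omega \mid X_\mathcal{N}, \hat{Y}_{\Omega^c}, H) - h(\hat{Y}_\Omega \mid Y_\Omega, X_\mathcal{N}, \hat{Y}_{\Omega^c}, H) \\
  &\le h(\hat{Y}_\Omega \mid X_\mathcal{N}, H) - h(\hat{Y}_\Omega \mid Y_\Omega, X_\mathcal{N}, H) \\
  &= (|\Omega| - 1) \log(D\sigma^2) - (|\Omega| - 1) \log((D-1)\sigma^2) \\
  &\le K(D-1) \log\left(1 + \frac{1}{D-1}\right) \le K.
\end{aligned}$$

Hence $\max_{\Omega:\, s \in \Omega,\, d \in \Omega^c} I(Y_\Omega; \hat{Y}_\Omega \mid X_\mathcal{N}, \hat{Y}_{\Omega^c}, H) \le K$. ■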
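The final step of the proof of Lemma 1 uses the elementary inequality $\log(1 + x) \le x$. As a worked step, assuming natural logarithms (rates measured in nats):

```latex
K(D-1)\log\left(1 + \frac{1}{D-1}\right)
  \le K(D-1)\cdot\frac{1}{D-1}
  = K .
```

With base-2 logarithms the same steps give $K \log_2 e \approx 1.44K$ bits, so the constant in the bound depends on the unit in which rates are measured.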

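We next lower bound the first term in (5).

Lemma 2.

$$\min_{\Omega:\, s \in \Omega,\, d \in \Omega^c} I(X_\Omega; \hat{Y}_{\Omega^c} \mid X_{\Omega^c}, H) \ge C(K, K) - K \log D$$

Proof: We first prove the following relation:

Claim 1.

$$\min_{\Omega:\, s \in \Omega,\, d \in \Omega^c} I(X_\Omega; Y_{\Omega^c} \mid X_{\Omega^c}, H) = C(K, K).$$

For notational convenience in this proof, we define
$C_{\text{i.i.d.}}(\Omega) := I(X_\Omega; Y_{\Omega^c} \mid X_{\Omega^c}, H)$, where we emphasize that
the inputs are i.i.d. $\mathcal{CN}(0, P)$ via the subscript "i.i.d.".

Consider a cut $\Omega$ that contains $M_1$ nodes from $\mathcal{V}_1$, $M_2$ from
$\mathcal{V}_2$ and so on until $M_{D-1}$ from $\mathcal{V}_{D-1}$ (see Figure 2). Recall
that we assume $s \in \Omega$ and $d \in \Omega^c$. Then $C_{\text{i.i.d.}}(\Omega)$ is given
by

$$E\left[\log\det\left(I + \frac{P}{\sigma^2} H_{\Omega \to \Omega^c} H_{\Omega \to \Omega^c}^{\dagger}\right)\right],$$

where $H_{\Omega \to \Omega^c}$ is a block diagonal matrix containing blocks
of size $M_1^c$-by-$K$, $M_2^c$-by-$M_1$, $M_3^c$-by-$M_2$, $\ldots$, $M_{D-1}^c$-by-$M_{D-2}$
and finally $K$-by-$M_{D-1}$. We have abused notation here
by defining $M_i^c := |\mathcal{V}_i| - M_i = K - M_i$.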

Since the capacity of a MIMO channel that has block
diagonal structure is the sum of the capacities of the individual
MIMO blocks, we have

$$\begin{aligned}
C_{\text{i.i.d.}}(\Omega) &= E\left[\log\det\left(I + \frac{P}{\sigma^2} H_{\Omega \to \Omega^c} H_{\Omega \to \Omega^c}^{\dagger}\right)\right] \\
  &= C(M_1^c, K) + C(M_2^c, M_1) + C(M_3^c, M_2) + \cdots + C(M_{D-1}^c, M_{D-2}) + C(K, M_{D-1}) \qquad (*)
\end{aligned}$$

We show below that $(*) \ge C(K, K)$. Note the following
properties of the function $C(x, y)$:

a) $C(x, y) = C(y, x)$,

b) $C(z, y) \ge C(x, y)$ if $z \ge x$,

c) $C(x, y) + C(K - x, y) \ge C(K, y)$, which can be shown
via an application of Hadamard's inequality.
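Property (c) can be sanity-checked numerically. The sketch below relies on the block form of Hadamard's inequality (Fischer's inequality), under which the inequality already holds for every channel realization; taking expectations then gives (c). It assumes $C(x, y) = E[\log\det(I + \text{snr} \cdot H H^{\dagger})]$ for an x-by-y matrix H with i.i.d. $\mathcal{CN}(0, 1)$ entries. The helper names `det`, `log_det_term` and `check_property_c` and the pure-Python determinant are illustrative choices, not part of the paper.

```python
import math
import random


def complex_gaussian(rng):
    """Draw one CN(0, 1) sample."""
    s = math.sqrt(0.5)
    return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))


def det(matrix):
    """Determinant of a square complex matrix by Gaussian elimination."""
    a = [row[:] for row in matrix]
    n = len(a)
    result = 1.0 + 0.0j
    for col in range(n):
        # Partial pivoting: pick the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0:
            return 0j
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result


def log_det_term(h, snr):
    """log det(I + snr * H H^dagger), with the rows of H indexing receivers."""
    n = len(h)
    gram = [[(1.0 if i == j else 0.0)
             + snr * sum(a * b.conjugate() for a, b in zip(h[i], h[j]))
             for j in range(n)] for i in range(n)]
    return math.log(det(gram).real)


def check_property_c(k, y, x, snr, rng):
    """Per-realization form of C(x, y) + C(K - x, y) >= C(K, y)."""
    h = [[complex_gaussian(rng) for _ in range(y)] for _ in range(k)]
    split = log_det_term(h[:x], snr) + log_det_term(h[x:], snr)
    return split >= log_det_term(h, snr) - 1e-9


if __name__ == "__main__":
    rng = random.Random(0)
    print(all(check_property_c(4, 3, 1, 10.0, rng) for _ in range(100)))
```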

Proving that the expression in $(*) \ge C(K, K)$ is just a matter
of applying these properties multiple times. For concreteness,
we show this for the case D = 4 below, which can be
generalized in a straightforward way to higher values of D.

$$\begin{aligned}
(*) &= C(M_1^c, K) + C(M_2^c, M_1) + C(M_3^c, M_2) + C(K, M_3) \\
    &\ge C(M_1^c, K) + C(M_2^c, M_1) + C(M_3^c, M_2) + C(M_2, M_3) \\
    &\ge C(M_1^c, K) + C(M_2^c, M_1) + C(K, M_2) \\
    &\ge C(M_1^c, K) + C(M_2^c, M_1) + C(M_1, M_2) \\
    &\ge C(M_1^c, K) + C(K, M_1) \\
    &\ge C(K, K),
\end{aligned}$$

where the first inequality follows by applying property (b) to
the last term in the first line, the second inequality follows by
applying (a) and (c) to the last two terms in the earlier line, and
so on. So we have shown that



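$$\min_{\Omega:\, s \in \Omega,\, d \in \Omega^c} C_{\text{i.i.d.}}(\Omega) \ge C(K, K). \qquad (6)$$

The cuts $\mathcal{V}^0, \mathcal{V}^1, \ldots, \mathcal{V}^{D-1}$ satisfy (6) with equality, so we are
done. (Each of these cuts induces a K-by-K MIMO channel
across the cut, i.e. $C_{\text{i.i.d.}}(\mathcal{V}^i) = C(K, K)$ for any $0 \le i \le D-1$.)
This proves Claim 1.

Due to our choice of the quantization, $\hat{Y} = Y + \hat{Z}$ where
$\hat{Z} \sim \mathcal{CN}(0, (D-1)\sigma^2)$, evaluating the term $I(X_\Omega; \hat{Y}_{\Omega^c} \mid X_{\Omega^c}, H)$ is
equivalent to evaluating $I(X_\Omega; Y_{\Omega^c} \mid X_{\Omega^c}, H)$ except that now
the noise is $Z + \hat{Z}$ instead of just $Z$, i.e. the noise power is
$D\sigma^2$ instead of $\sigma^2$. Hence,

$$\begin{aligned}
\min_{\Omega:\, s \in \Omega,\, d \in \Omega^c} I(X_\Omega; \hat{Y}_{\Omega^c} \mid X_{\Omega^c}, H)
  &= E\left[\log\det\left(I + \frac{P}{D\sigma^2} H_{\mathcal{V}_0 \to \mathcal{V}_1} H_{\mathcal{V}_0 \to \mathcal{V}_1}^{\dagger}\right)\right] \\
  &\ge C(K, K) - K \log(D),
\end{aligned} \qquad (7)$$

where the equality follows from Claim 1 applied with noise power
$D\sigma^2$, and the inequality holds because
$\det(I + A/D) \ge D^{-K} \det(I + A)$ for any K-by-K positive
semidefinite matrix A. This concludes the proof of the lemma.

C. Proof of Theorem 1

Via Lemma 2 and Lemma 1, we have proved Theorem 1 for
the case of a single K-antenna source. We now show that the
same result holds for the sum-capacity in the original setup
containing K single-antenna sources.

It is clear that the upper bound established in Section IV-A is
an upper bound on the achievable sum-rate for the K sources.
For the lower bound, we observe that since in the above
discussion we have chosen i.i.d. input distributions for the
antennas at the source, we can apply the same strategy and
therefore achieve the same total rate even if antennas are not
co-located. A more formal argument can be made as follows.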



Consider the setup with K sources as shown in Figure 1(b).
We fix the operation of the relays to be the same as that
described in Section IV-B. This induces a multiple access
channel between the sources $s_1, s_2, \ldots, s_K$ and the destination
d described by a certain pdf $p(y_d, H \mid x_{\mathcal{V}_0})$. It is well known
that the achievable rate region for a memoryless MAC
is a polymatroid and the largest achievable sum-rate is given
by $I(X_{\mathcal{V}_0}; Y_d, H)$ where $p(x_{\mathcal{V}_0}) = \prod_{j=1}^{K} p(x_{s_j})$ since the
transmitting nodes cannot cooperate. If we fix $p(x_{s_j})$ to
be the $\mathcal{CN}(0, P)$ pdf for all $1 \le j \le K$, then $I(X_{\mathcal{V}_0}; Y_d, H)$
is the same end-to-end mutual information that we obtain
in the case of a single source with K antennas using the
achievability scheme in Section IV-B. Thus, the lower bound
on the capacity for a single source with K antennas that we
proved in Lemma 2 also applies to the sum-capacity in the
case of K single-antenna sources. This completes the proof
of Theorem 1. ■

Remark: We point out that Theorem 1 continues to hold
if there are multiple destination nodes in the final layer, each
having K antennas and interested in all the messages.

V. Concluding Remarks

In this paper, we have considered a time-varying Gaussian 
relay network in which K sources communicate to a destina- 
tion over multiple layers of relays, each layer containing K 
nodes. We have shown that by better choosing the quantization 
level in the compress-and-forward strategies, we can improve 
the gap to capacity from linear to logarithmic in the depth of 
the network. This is obtained by decreasing the resolution of 
quantization as the number of nodes in the network increases, 
which decreases the associated rate penalty to communicate 
the quantization codewords to the destination.